#python #neural search #Jina #deep learning #clip #images

Image Encoders Part Deux: Let's ramp things up

Published Sep 21, 2021 by Alex C-G


Let’s get back to how well those image encoders cope with memes. We previously saw that when we compared memes that were very similar (in terms of pixel values), BigTransfer came out on top. But what if we’re searching for variants, like Buff Doge vs Cheems?

As far as I know, this meme isn’t in the dataset we’re using. That means an encoder can’t just do a straight-up pixel match like before. So what could happen? My guess here is that:

Okay, fight!

CLIP

[Images: query image and results]

BigTransfer

[Images: query image and top 3 results]

Oddly, BigTransfer surfaced this as the fifth result, so maaaybe it’s recognizing something like the dogs? Or maybe just a fluke.
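To make "fifth result" concrete: under the hood, a search like this encodes every image into a vector, then ranks the dataset by how close each vector is to the query's. Here's a minimal sketch of that ranking step in plain Python. It assumes cosine similarity as the metric, and it uses made-up three-number vectors and hypothetical file names in place of real CLIP or BigTransfer embeddings, so treat it as an illustration rather than the code behind these results.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=5):
    """Return the k (image_id, score) pairs most similar to the query."""
    scored = [
        (image_id, cosine_similarity(query_vec, vec))
        for image_id, vec in index.items()
    ]
    # Highest similarity first
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings" (assumption: real encoders output far longer vectors).
# File names are hypothetical, not taken from the actual dataset.
index = {
    "doge_original.jpg": [0.9, 0.1, 0.0],
    "cheems.jpg": [0.8, 0.3, 0.1],
    "drake.jpg": [0.0, 0.2, 0.9],
    "distracted_boyfriend.jpg": [0.1, 0.9, 0.2],
}
query = [1.0, 0.2, 0.0]  # stand-in for the encoded query meme

results = top_k(query, index, k=3)
for rank, (image_id, score) in enumerate(results, start=1):
    print(rank, image_id, round(score, 3))
```

The point is that a hit at rank five only means four other vectors happened to sit closer to the query. It says nothing on its own about whether the encoder "saw" a dog.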

So, was I right?

Nope, off by a long shot. Perhaps there’s something weird going on with CLIP’s feature detection. Either way, that’s two for two for BigTransfer when it comes to searching memes. Though it might’ve just got lucky hitting that fifth image.

In short, both encoders suck for this kind of thing, at least out of the box.

What’s next?

I’m exhausted with memes for now. But if anyone has any ideas for what to try next (and preferably wants to take part!) ping me on Twitter at @alexcg.



*****

© 2018-2024, Alex Cureton-Griffiths